Abstract- Since the advent of high-throughput methodologies such as microarrays, the volume of genomic data has increased geometrically, and
with it the need for computational methods to interpret these data. In the present work we studied the common gene
expression patterns between two tumor cell types of mesodermal origin. In particular, we attempted to find causal relations between
gene expression levels with respect to chromosomal location. Using regression analysis, we found that several genes manifested significant
relations and as such could pose interesting targets for further investigation. This type of analysis can lead to an
understanding of the common mechanisms that transform physiological cells into malignant ones, and it offers a holistic way to
understand the dynamics of tumor onset as well as the mechanics of oncogenic drivers. Such approaches could prove useful in the
prediction of genomic targets that could be further studied in order to unravel the mechanics of tumor ontogenesis.
I. Introduction

Acute lymphoblastic leukemia (ALL) and rhabdomyosarcoma (RMS) are two types of tumors that originate from the
embryonic mesoderm. ALL is the most frequent malignancy of childhood. Acute leukemia originates
from the undifferentiated lymphoblast, which fails to develop into the mature lymphoid cell and gives rise to a tumor. RMS is
a rare childhood cancer. It represents 5-8% of all childhood tumors, while sarcomas of the head and neck account for 12% of all
childhood neoplasias. RMS originates from myoblasts, the cells that will form skeletal muscle, which differ from
smooth muscle cells. RMS can arise in any part of the body that has skeletal muscle, but it frequently appears in the head
and neck. The embryonal form of RMS, which includes the spindle cell and botryoid variants with better prognosis, is most
common at birth, whereas the alveolar form mainly appears in childhood and adolescence. ALL and RMS consist of cells that are
undifferentiated, immortal and able to divide indefinitely. Myoblasts originate from the dorsal (paraxial)
mesoderm, while blood cells derive from the lateral mesoderm, which gives rise to the splanchnic mesoderm and this in turn to the
hemangioblastic tissue. ALL and many alveolar RMS in childhood present chromosomal translocations, such as PAX3-FKHR,
which is an indicator of poor prognosis and is associated with metastasis [1].

During embryogenesis, blood cells originate from two sites. The ventral mesoderm near the yolk sac gives rise
to the intra-embryonic hematopoietic precursors, whereas the hematopoietic cells that last throughout the entire lifetime of an
organism are derived from the mesodermal area surrounding the aorta. This differentiation is regulated by a network of
various genes, which leads to two similar cell types with different functions and roles in the body. Several factors
affect gene regulation, which means that aberrations in the regulatory network could lead to tumor cells.

Furthermore, it has been previously reported that correlation in gene expression, and in particular chromosomal correlation,
implies common gene regulation [1, 2]. Yet, it is also known that correlation does not imply causality. In that sense, it would be of
great importance to be able to infer gene regulatory mechanisms from chromosomal expression levels.

The present study extends a previous work in which gene expression was extensively investigated in
two cell lines: the T-cell acute lymphoblastic leukemia CCRF-CEM cell line and the rhabdomyosarcoma TE-671 cell line.
We used these cell lines to investigate common patterns of cellular function between the two systems. These two cell types
would normally have differentiated into blood and muscle cells, but at an unknown stage this normal differentiation
stopped and an abnormal, uncontrolled proliferation began, which created the malignancy. The intervening stages that lead to
tumorigenesis are unknown.
Thus, in the present work we attempted to investigate common gene regulation based on chromosomal correlations. Moreover, we
tried to move a step further and find probable etiological relations in gene expression patterns. The present work
focuses on approaches that combine biological and mathematical tools in order to unravel mechanisms in gene
expression patterns. Furthermore, it attempts to interpret gene regulation not from the
differential point of view but from common gene expression patterns.

II. Materials and Methods 


2.1 Cell Cultures 


The CCRF-CEM (ALL) and the TE-671 (RMS) cell lines were used as the model; both were obtained from the European
Collection of Cell Cultures (ECACC). The CCRF-CEM cell line, a CD4+ [3] and CD34+ presenting cell line [4], was initially
obtained from the peripheral blood of a 2-year-old Caucasian female. Her disease was diagnosed as lymphosarcoma, which later
progressed to acute leukemia [5]. The child had undergone irradiation therapy and chemotherapy before the cell line was obtained.
Although remission was achieved at various stages, the disease progressed rapidly [5]. The cell line was observed to have
undergone only minor changes after long-term culture, except for the presence of dense granules in the nucleoli [6]. Finally, the
CCRF-CEM cell line has been reported to manifest autocrine catalase activity, which participates in its mechanisms of growth
and progression [7]. The TE-671 cell line was initially reported to have been obtained from a cerebellar medulloblastoma,
before irradiation therapy, of a six-year-old Caucasian female [8] and was characterized later on [9]. However, it is now known
that this cell line is parental, if not identical, to the RD rhabdomyosarcoma cell line [10], although several reports still refer to
it as medulloblastoma [11, 12]. The experimental process of cell treatment has been described previously [1].

2.2 Experimental Procedures 

Experimental procedures have been described previously in detail by our group [1]. In brief, the factors estimated included cell
proliferation, cell cycle distribution and microarray experimentation. For the assay of mRNA levels, two sets of microarray
chips were used: cDNA microarray chips (4.8k genes) obtained from TAKARA (IntelliGene™ II Human CHIP 1) [13] and
microarray chips (9.6k genes) from the Institut fuer Molekularbiologie und Tumorforschung, Microarray Core Facility of the
Philipps-Universitaet, Marburg, Germany (IMT9.6k). Microarray experimentation has been described in our
previous work [1]. The microarray data have been submitted to the GEO Database under the Accession Number GSE34522.

2.3 Microarray Data Analysis 

Microarray data pre-processing was performed with ImaGene® v.6.0 software (BioDiscovery Inc, CA) and
ARMADA software (National Hellenic Research Foundation, Athens, Greece) [14]. Data were collected from the exported text
files and pre-processed in the Microsoft Excel® environment. Background correction was
performed using the robust loess-based background correction (rLsBC) approach, as proposed by Sifakis et al. (2011) [15].
The background-corrected signal intensities were further normalized in order to mitigate the effect of extraneous,
non-biological variation in the measured gene expression levels. Genes were tested for significance of differential expression
using a z-test. Genes were considered significantly differentially expressed if they obtained a p-value <0.05. The False
Discovery Rate (FDR) was calculated as previously described [16-18]. The FDR was 1% at p<0.05 for the
IntelliGene microarray chip and 9% at p<0.01 for the IMT 9.6k microarray chip. Calculating the FDR for the
combination of both platforms gave an FDR of 6% at p<0.01.
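
The testing step above can be illustrated with a minimal sketch. This section does not state how the z-scores were standardized, and the FDR procedure is only cited [16-18], so two assumptions are made here and should be read as such: each log2 ratio is standardized against the mean and standard deviation of all ratios, and the Benjamini-Hochberg procedure stands in for the cited FDR method. The function names are hypothetical.

```python
import math

def z_test_pvalues(log_ratios):
    """Two-sided z-test p-value for each log2 ratio.

    Assumption: each value is standardized against the mean and sample SD
    of all ratios; the paper does not specify the standardization used.
    """
    n = len(log_ratios)
    mean = sum(log_ratios) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in log_ratios) / (n - 1))
    # Two-sided tail area of the standard normal: p = erfc(|z| / sqrt(2)).
    return [math.erfc(abs((v - mean) / sd) / math.sqrt(2)) for v in log_ratios]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used here as a stand-in for the cited FDR method.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up log2 ratios, for illustration only (not data from the study).
example_ratios = [-1.0, 0.0, 0.0, 0.0, 1.0]
p = z_test_pvalues(example_ratios)
q = benjamini_hochberg(p)
```

Genes whose adjusted value falls below a chosen threshold would be called differentially expressed; the remaining, similarly expressed genes are the ones carried forward to the chromosomal regressions.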

2.4 Chromosomal Regressions and Data Analysis 

Chromosome mapping was performed with Genesis 1.7.2 (Technische Universitaet Graz, Austria) using Pearson’s
correlation and Spearman’s rank order correlation [2, 19, 20], and with the WebGestalt web-tool (Vanderbilt University, Nashville, TN, USA,
http://bioinfo.vanderbilt.edu/gotm/) [21]. 2D chromosomal regressions were performed among the similarly expressed
genes between RMS (TE-671) and ALL (CCRF-CEM). 3D chromosomal regressions were performed between the similarly
expressed genes of RMS and ALL with a third dimension added, which was the log2-transformed ratio of the
respective genes. Regressions and data analysis were performed in the Matlab® computational environment (The
MathWorks Inc., Natick, MA). For the 2D regressions a second-degree polynomial of the form $f(x)=y=ax^2+bx+c$ was used,
while for the 3D regressions an equation of the form $f(x,y)=z=ay+bx+c$ was used.
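
The two model forms named above can be sketched as ordinary least-squares fits. The original analysis was done in Matlab®; the NumPy version below is an illustrative stand-in, not the authors' code. The function names, the synthetic data, and the reading of x as CCRF-CEM expression and y as TE-671 expression are assumptions made for the example.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns ((a, b, c), R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b, c = np.polyfit(x, y, deg=2)
    return (a, b, c), r_squared(y, a * x**2 + b * x + c)

def fit_plane(x, y, z):
    """Least-squares fit of z = a*y + b*x + c; returns ((a, b, c), R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    # Column order matches the model: a multiplies y, b multiplies x.
    design = np.column_stack([y, x, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    a, b, c = coeffs
    return (a, b, c), r_squared(z, design @ coeffs)

# Synthetic expression levels for one chromosome (made up, not study data).
rng = np.random.default_rng(0)
ccrf_cem = rng.uniform(1.0, 10.0, size=30)                     # x
te_671 = 0.3 * ccrf_cem**2 + 2.0 + rng.normal(0.0, 0.2, 30)    # y
log2_ratio = np.log2(te_671 / ccrf_cem)                        # z, third dimension

coeffs_2d, r2_2d = fit_quadratic(ccrf_cem, te_671)
coeffs_3d, r2_3d = fit_plane(ccrf_cem, te_671, log2_ratio)
print("2D fit:", coeffs_2d, "R^2 =", r2_2d)
print("3D fit:", coeffs_3d, "R^2 =", r2_3d)
```

In practice the fits would be repeated per chromosome on the similarly expressed genes mapped to it, and the resulting R^2 values compared across chromosomes.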
III. Results

Regression analysis was performed in order to find probable causal relations in gene expression with respect to cell type and
chromosomal distribution. In particular, we used genes that manifested similar expression levels, that is, genes that
were not differentially expressed. We performed two different kinds of regression analysis: a 2D regression
of the form $y=ax^2+bx+c$ and a 3D regression of the form $z=ay+bx+c$. The 2D regressions manifested
significant relations between the two cell types, in particular for chromosome 1 with $R^2=0.82$ (Fig. 1A), chromosome
3 with $R^2=0.82$ (Fig. 1B), chromosome 8 with $R^2=0.89$ (Fig. 1C), chromosome 13 with $R^2=0.71$ (Fig. 1D),
chromosome 14 with $R^2=0.91$ (Fig. 1E), chromosome 15 with $R^2=0.69$ (Fig. 1F), chromosome 16 with $R^2=0.89$
(Fig. 1G), chromosome 17 with $R^2=0.81$ (Fig. 1H), chromosome 21 with $R^2=0.88$ (Fig. 1I), chromosome 22 with
$R^2=0.97$ (Fig. 1J) and chromosome X with $R^2=0.71$ (Fig. 1K).
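
The section reports $R^2$ values without stating how they were computed. Assuming the conventional coefficient of determination for a least-squares fit (an assumption; the symbols $y_i$, $\hat{y}_i$, $\bar{y}$ and $n$ are introduced here for illustration only), the quantity would be:

```latex
% y_i: observed expression level of gene i; \hat{y}_i: value predicted by the fit;
% \bar{y}: mean of the observed values; n: number of genes on the chromosome.
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2},
\qquad
\hat{y}_i = a x_i^2 + b x_i + c .
```

Under this definition, a value near 1 means the fitted curve accounts for most of the variance in the observed levels, while a value near 0 means it explains little more than the mean does.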



Figure 1. Regressions of gene expression data between CCRF-CEM and TE-671 cells, with respect to
chromosomal expression. Significant relations were found for chromosomes 1 (A), 3 (B), 8 (C), 13 (D), 14
(E), 15 (F), 16 (G), 17 (H), 21 (I), 22 (J) and X (K).


Although the 2D regressions manifested significant relations with regard to gene expression, the fitted relations did not
fully satisfy the definition of a function. In other words, for every $x \in D(f)$ there are two $y$, which can be
described as $\exists x_1, x_2 \in D(f): x_1 \neq x_2 \Rightarrow f(x_1)=f(x_2)$, or $y_1=y_2$, where $D(f)$ is the domain of $f$, and $x$, $y$ are the expression
levels of each gene. This observation raised the question of whether the manifested correlations could provide
more information if another dimension were added to the calculations. The answer came from the
observation that the log2-transformed ratio of the TE-671 gene expression levels over the CCRF-CEM gene expression levels
could be added as the third dimension. Thus, we formulated a new relation $z=ay+bx+c$ or $f(x,y)=ay+bx+c$, where $z$ can

Figure 2. 3D regressions of gene expression data between CCRF-CEM and TE-671 cells and their
respective log2-transformed ratio with respect to chromosomal expression. Significant relations were
found for chromosomes 1 (A) with a biharmonic function, 1 (B) with a linear function, 7 (C), 9 (D), 10 (E), 11
(F), 13 (G), 21 (H) and X (I).


3D regression analysis manifested significant relations between the two cell types and their ratio, in
particular for chromosome 1 with $R^2=1$ (Fig. 2A) using the biharmonic function, chromosome 1 with
$R^2=0.11$ (Fig. 2B) using the aforementioned linear function, chromosome 7 with $R^2=0.98$ (Fig. 2C),
chromosome 9 with $R^2=0.96$ (Fig. 2D), chromosome 10 with $R^2=0.81$ (Fig. 2E), chromosome 11 with $R^2=0.92$ (Fig.
2F), chromosome 13 with $R^2=0.85$ (Fig. 2G), chromosome 21 with $R^2=0.12$ (Fig. 2H) and chromosome X with
$R^2=0.87$ (Fig. 2I).

It appeared that, for the specified chromosomes, relations in three dimensions gave better results than the two-dimensional
regressions. This observation suggests that a third dimension is probably necessary in order to comprehend gene
regulatory mechanisms.


IV. Discussion 

In the present study we used computational and mathematical approaches to answer the question of whether a causal
relation between the gene expression levels of two cell types could be found based on their chromosomal distribution. We
found that significant relations could be obtained using both 2D and 3D regression methods. In particular, the 2D
regressions manifested very good relations among cell types with respect to chromosomes, yet without manifesting a clear
causal relation. On the other hand, the 3D regressions manifested better results, suggesting that causal multi-dimensional relations
probably exist. At the same time, the fact that certain genes and chromosomes manifest significant correlations between two
different cell types of similar origin implies that they could constitute common regulatory mechanisms of tumor
ontogenesis. In our previous work we reported that ANXA4 (Chromosome 2), NP25 (Chromosome 3), VEGFC
(Chromosome 4), PDLIM7 and THBS4 (Chromosome 5), CUL7 (Chromosome 6), CD36 (Chromosome 7), BNIP3L
(Chromosome 8), CDC2 and IL2RA (Chromosome 10), MAB21L1 and FOXO1 (Chromosome 13), TCL1A (Chromosome
14), FEM1B (Chromosome 15), CFDP1 and MMP2 (Chromosome 16), RTTN (Chromosome 18), PDCD5 (Chromosome
19), NNAT and ZNF313 (Chromosome 20), MBNL3, PLAC1, RPS6KA3 and CD40LG (Chromosome X) were genes
found to participate in cell differentiation and embryonal processes [1]. These genes were found to manifest significant
correlations in both the 2D and the 3D regressions. This finding is an important observation, since it is probable
that these genes share a common machinery, either through their gene regulation or through aberrations, that drives tumor
ontogenesis in the two cell types under investigation.

International Journal of Engineering Research & Science (IJOER), ISSN: [2395-6992], [Vol-3, Issue-3, March-2017]


V. Conclusion 

The present approach attempted to find common regulatory mechanisms in the gene expression patterns of two cell types. In
particular, we attempted to discover possible causal relations in gene expression patterns. We found evidence that
such relations could exist, and it is probable that certain genes could be of great significance in tumor ontogenesis. Such
approaches could prove to be useful in the prediction of genomic targets that could be further studied in order to unravel the 
mechanics of tumor ontogenesis. 